Many applications of representation learning, such as privacy preservation, algorithmic fairness, and domain adaptation, desire explicit control over semantic information being discarded. This goal is formulated as satisfying two objectives: maximizing utility for predicting a target attribute while simultaneously being invariant (independent) to a known semantic attribute. Solutions to invariant representation learning (IRepL) problems lead to a trade-off between utility and invariance when they are competing. While existing works study bounds on this trade-off, two questions remain outstanding: 1) What is the exact trade-off between utility and invariance? and 2) What are the encoders (mapping the data to a representation) that achieve the trade-off, and how can we estimate it from training data? This paper addresses these questions for IRepLs in reproducing kernel Hilbert spaces (RKHS)s. Under the assumption that the distribution of a low-dimensional projection of high-dimensional data is approximately normal, we derive a closed-form solution for the global optima of the underlying optimization problem for encoders in RKHSs. This yields closed formulae for a near-optimal trade-off, corresponding optimal representation dimensionality, and the corresponding encoder(s). We also numerically quantify the trade-off on representative problems and compare them to those achieved by baseline IRepL algorithms.
translated by 谷歌翻译
Quantitative cephalometric analysis is the most widely used clinical and research tool in modern orthodontics. Accurate localization of cephalometric landmarks enables the quantification and classification of anatomical abnormalities, however, the traditional manual way of marking these landmarks is a very tedious job. Endeavours have constantly been made to develop automated cephalometric landmark detection systems but they are inadequate for orthodontic applications. The fundamental reason for this is that the amount of publicly available datasets as well as the images provided for training in these datasets are insufficient for an AI model to perform well. To facilitate the development of robust AI solutions for morphometric analysis, we organise the CEPHA29 Automatic Cephalometric Landmark Detection Challenge in conjunction with IEEE International Symposium on Biomedical Imaging (ISBI 2023). In this context, we provide the largest known publicly available dataset, consisting of 1000 cephalometric X-ray images. We hope that our challenge will not only derive forward research and innovation in automatic cephalometric landmark identification but will also signal the beginning of a new era in the discipline.
translated by 谷歌翻译
Cardiac resynchronization therapy (CRT) is a treatment that is used to compensate for irregularities in the heartbeat. Studies have shown that this treatment is more effective in heart patients with left bundle branch block (LBBB) arrhythmia. Therefore, identifying this arrhythmia is an important initial step in determining whether or not to use CRT. On the other hand, traditional methods for detecting LBBB on electrocardiograms (ECG) are often associated with errors. Thus, there is a need for an accurate method to diagnose this arrhythmia from ECG data. Machine learning, as a new field of study, has helped to increase human systems' performance. Deep learning, as a newer subfield of machine learning, has more power to analyze data and increase systems accuracy. This study presents a deep learning model for the detection of LBBB arrhythmia from 12-lead ECG data. This model consists of 1D dilated convolutional layers. Attention mechanism has also been used to identify important input data features and classify inputs more accurately. The proposed model is trained and validated on a database containing 10344 12-lead ECG samples using the 10-fold cross-validation method. The final results obtained by the model on the 12-lead ECG data are as follows. Accuracy: 98.80+-0.08%, specificity: 99.33+-0.11 %, F1 score: 73.97+-1.8%, and area under the receiver operating characteristics curve (AUC): 0.875+-0.0192. These results indicate that the proposed model in this study can effectively diagnose LBBB with good efficiency and, if used in medical centers, will greatly help diagnose this arrhythmia and early treatment.
translated by 谷歌翻译
Commonly adopted in the manufacturing and aerospace sectors, digital twin (DT) platforms are increasingly seen as a promising paradigm to control, monitor, and analyze software-based, "open", communication systems. Notably, DT platforms provide a sandbox in which to test artificial intelligence (AI) solutions for communication systems, potentially reducing the need to collect data and test algorithms in the field, i.e., on the physical twin (PT). A key challenge in the deployment of DT systems is to ensure that virtual control optimization, monitoring, and analysis at the DT are safe and reliable, avoiding incorrect decisions caused by "model exploitation". To address this challenge, this paper presents a general Bayesian framework with the aim of quantifying and accounting for model uncertainty at the DT that is caused by limitations in the amount and quality of data available at the DT from the PT. In the proposed framework, the DT builds a Bayesian model of the communication system, which is leveraged to enable core DT functionalities such as control via multi-agent reinforcement learning (MARL), monitoring of the PT for anomaly detection, prediction, data-collection optimization, and counterfactual analysis. To exemplify the application of the proposed framework, we specifically investigate a case-study system encompassing multiple sensing devices that report to a common receiver. Experimental results validate the effectiveness of the proposed Bayesian framework as compared to standard frequentist model-based solutions.
translated by 谷歌翻译
The ability to effectively reuse prior knowledge is a key requirement when building general and flexible Reinforcement Learning (RL) agents. Skill reuse is one of the most common approaches, but current methods have considerable limitations.For example, fine-tuning an existing policy frequently fails, as the policy can degrade rapidly early in training. In a similar vein, distillation of expert behavior can lead to poor results when given sub-optimal experts. We compare several common approaches for skill transfer on multiple domains including changes in task and system dynamics. We identify how existing methods can fail and introduce an alternative approach to mitigate these problems. Our approach learns to sequence existing temporally-extended skills for exploration but learns the final policy directly from the raw experience. This conceptual split enables rapid adaptation and thus efficient data collection but without constraining the final solution.It significantly outperforms many classical methods across a suite of evaluation tasks and we use a broad set of ablations to highlight the importance of differentc omponents of our method.
translated by 谷歌翻译
Chromosome analysis is essential for diagnosing genetic disorders. For hematologic malignancies, identification of somatic clonal aberrations by karyotype analysis remains the standard of care. However, karyotyping is costly and time-consuming because of the largely manual process and the expertise required in identifying and annotating aberrations. Efforts to automate karyotype analysis to date fell short in aberration detection. Using a training set of ~10k patient specimens and ~50k karyograms from over 5 years from the Fred Hutchinson Cancer Center, we created a labeled set of images representing individual chromosomes. These individual chromosomes were used to train and assess deep learning models for classifying the 24 human chromosomes and identifying chromosomal aberrations. The top-accuracy models utilized the recently introduced Topological Vision Transformers (TopViTs) with 2-level-block-Toeplitz masking, to incorporate structural inductive bias. TopViT outperformed CNN (Inception) models with >99.3% accuracy for chromosome identification, and exhibited accuracies >99% for aberration detection in most aberrations. Notably, we were able to show high-quality performance even in "few shot" learning scenarios. Incorporating the definition of clonality substantially improved both precision and recall (sensitivity). When applied to "zero shot" scenarios, the model captured aberrations without training, with perfect precision at >50% recall. Together these results show that modern deep learning models can approach expert-level performance for chromosome aberration detection. To our knowledge, this is the first study demonstrating the downstream effectiveness of TopViTs. These results open up exciting opportunities for not only expediting patient results but providing a scalable technology for early screening of low-abundance chromosomal lesions.
translated by 谷歌翻译
Federated Learning (FL) is a scheme for collaboratively training Deep Neural Networks (DNNs) with multiple data sources from different clients. Instead of sharing the data, each client trains the model locally, resulting in improved privacy. However, recently so-called targeted poisoning attacks have been proposed that allow individual clients to inject a backdoor into the trained model. Existing defenses against these backdoor attacks either rely on techniques like Differential Privacy to mitigate the backdoor, or analyze the weights of the individual models and apply outlier detection methods that restricts these defenses to certain data distributions. However, adding noise to the models' parameters or excluding benign outliers might also reduce the accuracy of the collaboratively trained model. Additionally, allowing the server to inspect the clients' models creates a privacy risk due to existing knowledge extraction methods. We propose CrowdGuard, a model filtering defense, that mitigates backdoor attacks by leveraging the clients' data to analyze the individual models before the aggregation. To prevent data leaks, the server sends the individual models to secure enclaves, running in client-located Trusted Execution Environments. To effectively distinguish benign and poisoned models, even if the data of different clients are not independently and identically distributed (non-IID), we introduce a novel metric called HLBIM to analyze the outputs of the DNN's hidden layers. We show that the applied significance-based detection algorithm combined can effectively detect poisoned models, even in non-IID scenarios. We show in our extensive evaluation that CrowdGuard can effectively mitigate targeted poisoning attacks and achieve in various scenarios a True-Positive-Rate of 100% and a True-Negative-Rate of 100%.
translated by 谷歌翻译
通过查找图像可能不满意的图像来捕获对象检测器的错误行为,这一兴趣很长。在实际应用(例如自动驾驶)中,对于表征除了简单的检测性能要求之外的潜在失败也至关重要。例如,与远处未遗漏的汽车检测相比,错过对靠近自我车辆的行人的侦查通常需要更仔细的检查。在测试时间预测这种潜在失败的问题在文献和基于检测不确定性的传统方法中被忽略了,因为它们对这种错误的细粒度表征不可知。在这项工作中,我们建议将查找“硬”图像作为基于查询的硬图像检索任务的问题进行重新制定,其中查询是“硬度”的特定定义,并提供了一种简单而直观的方法,可以解决此任务大型查询家庭。我们的方法完全是事后的,不需要地面真相注释,独立于检测器的选择,并且依赖于有效的蒙特卡洛估计,该估计使用简单的随机模型代替地面真相。我们通过实验表明,它可以成功地应用于各种查询中,它可以可靠地识别给定检测器的硬图像,而无需任何标记的数据。我们使用广泛使用的视网膜,更快的RCNN,Mask-RCNN和CASCADE MASK-RCNN对象检测器提供有关排名和分类任务的结果。
translated by 谷歌翻译
这项研究旨在使用人工智能(AI)和多视图图像实现更可靠的自动化后建筑物损害分类。当前的实践和研究工作在采用AI进行灾后损害评估的AI方面通常是(a)定性,基于标准损害量表缺乏建筑物损害水平的精制分类,并且(b)基于空中或卫星图像培训,具有有限的视图,视图有限,尽管有指示性,但并不完全描述损伤量表。为了使损伤水平的更准确和可靠的自动量化量化,本研究提出了以多种地面和建筑物的空中视图形式使用更全面的视觉数据。为了具有这样的空间感知的损害预测模型,使用了多视图卷积神经网络(MV-CNN)体系结构,结合了损坏建筑物不同视图的信息。这种空间3D上下文损害信息将导致更准确地识别损害和可靠的损害水平量化。拟议的模型经过训练和验证,并在侦察视觉数据集上进行了验证,其中包含飓风哈维后检查的建筑物的专家标签,地理标记的图像。开发的模型在预测损害水平方面表现出合理的准确性,可用于支持更加知识和可靠的AI-AI-AS辅助灾害管理实践。
translated by 谷歌翻译
推荐兴趣点是一个困难的问题,需要从基于位置的社交媒体平台中提取精确的位置信息。对于这种位置感知的推荐系统而言,另一个具有挑战性和关键的问题是根据用户的历史行为对用户的偏好进行建模。我们建议使用Transformers的双向编码器表示的位置感知建议系统,以便为用户提供基于位置的建议。提出的模型包含位置数据和用户偏好。与在序列中预测每个位置的下一项(位置)相比,我们的模型可以为用户提供更相关的结果。基准数据集上的广泛实验表明,我们的模型始终优于各种最新的顺序模型。
translated by 谷歌翻译